Gametophytic self-incompatibility (GSI) is a mechanism in flowering plants that prevents
inbreeding and promotes outcrossing. GSI is under the control of a specific locus, known as
the S-locus, which contains at least two genes, the S-RNase and the SFB. Active S-RNases
in the style are essential for rejection of haploid pollen when the pollen S-allele matches
one of the two S-alleles of the diploid pistil. However, the nature of their mutual interactions at
the genetic and biochemical levels remains unclear. A detailed understanding of the structure
of the proteins involved in GSI may therefore help reveal how these proteins function and
how they fulfill their biological roles. To this end, 3D models of the SC (Sf) and
two SI (S8 and S23) S-RNases of almond were constructed using comparative modeling
tools. The modeled structures consisted of mixed α and β folds, with six helices and six
β-strands. However, the self-compatible (Sf) RNase contained an additional extended loop
between the conserved domains RC4 and C5, which may be involved in the manifestation
of self-compatibility in almond.
INTRODUCTION

Most almond (Prunus amygdalus Batsch) cultivars are
self-incompatible (SI; Socias i Company, 1990). SI in the Prunus
species is of the gametophytic self-incompatibility (GSI) type,
controlled by a single polymorphic locus containing at
least two linked genes, one specifically expressed in the pistil
and the other in the pollen (Kao and Tsukamoto, 2004). Pollen
tube growth is arrested in the style whenever the single S allele
expressed in the haploid pollen matches one of the two S
haplotypes expressed in the diploid pistil tissue. The pistil component
of SI in Rosaceae, Solanaceae, and Plantaginaceae has been
determined to be an S-RNase (McClure et al., 1989). The Prunus
S-RNase is of the T2-type (Igic and Kohn, 2001), with five
conserved domains (C1, C2, C3, RC4, and C5) and one hyper-variable
region (Sassa et al., 1997). The candidate gene for the pollen
component in almond was identified as an SFB by Ushijima
et al. (2003), showing a tight association with the S-RNase gene
(Ikeda et al., 2005).

In spite of the knowledge on the genetic structure of the female
and male determinants in SI, the nature of their interactions
remains unclear. The S-RNases are proteins and, as such, are built
from sequences of amino acid residues encoded by the
corresponding gene. The linked amino acid residues fold in space
to form a 3D structure. Knowledge of the 3D structure has
been useful for understanding how some proteins work and
which molecular mechanisms underpin their function. Thus, the
3D structure of the S-RNase proteins involved in SI may shed light
on the recognition mechanism of GSI in the Rosaceous
species at the molecular level, and on how these proteins
mediate the GSI function to fulfill their biological roles.

Protein structure can be determined experimentally using
X-ray crystallography, nuclear magnetic resonance spectroscopy,
and cryoelectron microscopy, but these approaches are
time-consuming (Ida et al., 2001). Consequently, predictive computer
molecular modeling has been considered a useful alternative.
Molecular modeling may be defined as the science and/or art 
that defines molecular structure and function and that yields a 
3D model through computation. Protein structures are guided by 
two sets of principles operating on vastly different time scales. 
The first set of principles is defined by the laws of physics, 
while the second set is directed by the theory of evolution. Each 
of these two sets of principles has led to the development of 
predictive methods to build 3D protein models (Hrmova and 
Fincher, 2009). 

Currently, one of the most popular comparative modeling
programs is MODELLER (Sali and Blundell, 1993). It is a computer
program that models 3D structures of proteins and their
assemblies by satisfaction of spatial restraints. The user provides an
alignment of the sequence to be modeled with related, already
known 3D structures, and MODELLER calculates a new 3D model of
the target protein. The array of 10-50 models typically produced by
MODELLER can be evaluated to assess the stereochemical
quality and the energy profiles of the protein models. After selecting
the best model, the structure needs to be put in perspective with
a biological function and tested to see whether the model is helpful in
proposing a useful hypothesis in biology.
Consequently, our objective was to identify the 3D structures
of the almond S-RNases and SFBs through molecular modeling
tools and to investigate a link between their 3D structures and the
SI mechanism.

MATERIALS AND METHODS 

Three different S-RNases from two almond cultivars were modeled
because their sequences and physiological activity were available
(Fernandez i Marti et al., 2009). The S-RNase sequences have
been deposited in the EMBL/DDBJ/GenBank databases under AB467371
(Sf-RNase from "Blanquerna"), AB481108 (S8-RNase from
"Blanquerna"), and AB488496 (S23-RNase from "Vivot").

The modeling procedure started with the alignment of the
sequence to be modeled (target) with related known 3D
structures (templates) retrieved from the Protein Data Bank (PDB) using
FASTA and BLAST (EMBL nucleotide database). In this
procedure, the template selected among all candidates must show
the highest identity with the target, and in any case higher than 35%.
The coordinates of this template protein were used as the basis for
further modeling.
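
The template-selection rule described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `percent_identity` and `pick_template`, the toy sequences, and the choice to compute identity over aligned (non-gap) columns are all assumptions made for illustration; FASTA and BLAST report identity using their own conventions.

```python
def percent_identity(target_aln: str, template_aln: str) -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption: identity is counted over columns where neither row has a gap
    ('-'). Other definitions (e.g. dividing by the target length) also exist.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(target_aln, template_aln)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def pick_template(alignments: dict, min_identity: float = 35.0):
    """Return (template_id, identity) for the most similar template.

    `alignments` maps a hypothetical template id to a pair
    (aligned target row, aligned template row). Returns None when no
    template exceeds `min_identity` (35% in the text above).
    """
    scored = [(percent_identity(t, s), name)
              for name, (t, s) in alignments.items()]
    identity, name = max(scored, default=(0.0, None))
    if name is None or identity <= min_identity:
        return None
    return name, identity


if __name__ == "__main__":
    # Toy alignments with made-up template ids, for illustration only.
    candidates = {
        "tplA": ("ACDEFGHIKL", "ACDEFGHIKV"),
        "tplB": ("ACDEFGHIKL", "ACDXXXXXXX"),
    }
    print(pick_template(candidates))
```

When no candidate passes the threshold the function returns None, which corresponds to the situation in which comparative modeling cannot proceed for lack of a suitable template.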

Once the best candidate template was selected, the alignment
between the S-RNase sequences and the template
was adjusted manually to minimize the number of gaps and
insertions/deletions. The frame of the 3D model was constructed
with MODELLER 9v5. A total of 40 models were constructed for
each S-RNase. The four models with the lowest value of the
MODELLER objective function were chosen for further refinement. The
energy function was evaluated with PROSAIIv3 (Sippl, 1993). This
program detects errors in protein structures and thus serves to
indicate their quality.
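
The ranking step (keeping the four models with the lowest objective function out of 40) can be sketched as follows. This is an illustration only: the `ModelScore` type, the function name `lowest_objective`, and the example scores are hypothetical, and the sketch assumes the objective function values have already been collected from the MODELLER output rather than calling MODELLER itself.

```python
from typing import Iterable, List, NamedTuple


class ModelScore(NamedTuple):
    """A model file name paired with its objective function value."""
    name: str
    objective: float


def lowest_objective(models: Iterable[ModelScore], keep: int = 4) -> List[ModelScore]:
    """Return the `keep` models with the lowest objective function value.

    Lower values indicate fewer violations of the spatial restraints,
    so the list is sorted in ascending order.
    """
    return sorted(models, key=lambda m: m.objective)[:keep]


if __name__ == "__main__":
    # Made-up scores for six hypothetical models; a real run would have 40.
    scores = [
        ModelScore("model_01", 1520.4),
        ModelScore("model_02", 1388.9),
        ModelScore("model_03", 1611.0),
        ModelScore("model_04", 1402.3),
        ModelScore("model_05", 1395.7),
        ModelScore("model_06", 1450.1),
    ]
    for model in lowest_objective(scores):
        print(model.name, model.objective)
```

The shortlisted models would then be passed to the energy and stereochemical checks, since a low objective function alone does not guarantee a well-formed structure.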

In addition, the stereochemical quality and the overall
G-factors of the protein models were calculated using PROCHECK
(Laskowski et al., 1993). This software compares the
residue-by-residue geometry of a set of closely related structures. The models
with the fewest amino acid residues in disallowed regions
were selected as the most suitable models. A Ramachandran plot
(also known as a Ramachandran map or a Ramachandran
diagram) output by PROCHECK plots the dihedral angle ψ
against φ for the amino acid residues in the protein structure, thus
showing the possible conformations of the ψ and φ angles for a
polypeptide (Ramachandran et al., 1963).
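
For readers unfamiliar with the quantities on a Ramachandran plot, the sketch below shows how a single dihedral angle can be computed from four atom positions. For a residue i, φ is defined by the atoms C(i-1), N(i), CA(i), C(i) and ψ by N(i), CA(i), C(i), N(i+1). The function name `dihedral_deg` and the test coordinates are illustrative assumptions; PROCHECK computes these angles internally and this is not its code.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, -180..180) defined by four points.

    The angle is measured around the p1-p2 axis between the plane
    (p0, p1, p2) and the plane (p1, p2, p3).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    axis = tuple(x / norm for x in b1)
    # Remove the components parallel to the central bond.
    v = _sub(b0, tuple(_dot(b0, axis) * x for x in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * x for x in axis))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Idealized geometry: the fourth point is rotated a quarter turn
    # around the central bond relative to the first point.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Each residue contributes one (φ, ψ) pair, and the plot classifies that pair as falling in an allowed or a disallowed region.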

During further modeling, the loop refinement protocol
was used to generate 40 new models from the previously
best model. The same steps as described above were followed,
selecting the best four models according to their lowest
values of the MODELLER objective function, and then selecting the
"best of the best" from the results obtained by PROSAIIv3
and PROCHECK. Finally, the molecular graphics were
generated with PYMOL, which visualizes protein structures
(http://www.pymol.org).

FIGURE 1 | Ramachandran plots of sfi1_BL00040001 (A) and sfi1_BL00010001 (B).

RESULTS AND DISCUSSION 

The 3D models of the Sf-, S23-, and S8-RNases were compared
with the related known 3D structures derived from PDB. The
best candidate template selected was the RNase MC1 mutant with
accession 1J1G (Numata et al., 2003), because the identity between
this template and the target sequences was 42%. On the other
hand, the SFBf, SFB8, and SFB23 models could not be generated
because no structure with a sequence identity higher than 30% was found
in PDB.

Protein structures represent combinations of secondary
structural elements, α-helices and β-strands, that are interconnected by
loops. These structural elements form the core regions (the inside
of the molecule) and are connected by loop regions on the
protein surface with surface-exposed α-helices and β-strands. The
structure of the S-RNases belonged to the α and β class, with
six α-helices and six β-strands connected by loops. The folding
topologies of their main chains were very similar to the topologies
of the RNase T2 family enzymes. Their overall dimensions were
approximately 40 Å x 50 Å x 30 Å.

Ramachandran plot statistics for the S-RNases showed that
97% of the amino acid residues were positioned in the "allowed" regions.
In fact, when structures place 95-97% or more of the amino acid
residues in the "allowed" positions, they are considered to be
reliable in modeling experiments, and this indicates how well the
structures fit the expected main chain bond length and torsion
angle distributions (Laskowski et al., 1993; Kleywegt and Jones,
1996). The best four models of the Sf-, S23-, and S8-RNases
were selected for further modeling. As shown in Figure 1, in
the model sfi1_BL00040001 (Figure 1A), all residues were
positioned in the allowed region (red arrow), whereas in the model
sfi1_BL00010001 (Figure 1B), 1.6% of the residues were in a
disallowed region (green arrow). Thus, the model BL00040001
was selected as the best model to be analyzed. Higher numbers
of residues in the disallowed region reflect a distorted
geometry in the models, because higher proportions of
residues fall outside the limits of main chain bond length and
torsion angles derived from the small molecule library (Engh
and Huber, 1991). These results indicate that our models were
optimal.

When the three S-RNases were superimposed, the Sf-RNase
structure contained an additional extended loop, which was not
present in the S8 or S23 models. This loop, shown in Figures 2
and 3, contained the amino acid residues CKG NPQ RQA KSQ
PKN RGK SQP KSQ ATT QFL, which were placed between the
conserved domains RC4 and C5. The software PYMOL made it
possible to visualize which amino acid residues comprised
α-helices, β-strands, and loops. It has been suggested that loops
in 3D structures serve to interconnect α-helices and β-strands,
and also that longer surface-exposed loops could be susceptible to
proteolytic degradation (Branden and Tooze, 1998). As the main
structural difference found between the Sf-, S8-, and S23-RNases
resides in the presence of this "extended looping region," this long
loop could be prone to degradation and, as a consequence, this
S-RNase could be less stable. As a result of this possible
degradation, the pollen tube could grow through its own pistil, giving
rise to SC.

FIGURE 2 | Ribbon diagram (A) and surface representation (B) of the modeled
structure of almond Sf-RNase (magenta), S8-RNase (cyan), and S23-RNase
(red), showing secondary structural elements and surfaces. Black arrow
indicates the "long loop" found in the Sf model.

FIGURE 3 | Multiple sequence alignment of the Sf, S8, and S23 sequences of RNases, indicating the amino acid residues that belong to the "extended
loop" in the Sf 3D structure (marked in blue).

Fernandez i Marti et al. | 3D modeling of S-RNases in almond
www.frontiersin.org | June 2012 | Volume 3 | Article 139

Additionally, the 3D models of the Sf-, S8-, and S23-RNases
were compared with the structure of an S-RNase from another Rosaceous
species, the Pyrus pyrifolia S3-RNase (Matsuura et al., 2001). The
structure of the pear S3-RNase was consistent with the models of the
almond S8- and S23-RNases. Given that both the pear and these almond
S-RNases confer SI, and that their models did not contain this extended loop,
the main structural difference between the SI and the SC RNases
could reside in the presence of the loop. Therefore, the amino acid
residues that form the extended loop positioned at the surface of
the Sf-RNase (Figure 2), between the conserved domains RC4 and
C5, could be responsible for the differences in function of the RNases.



Further studies are required to ascertain how this loop medi- 
ates SI and to reveal if it indeed is involved in the SI mechanism 
in plants. 

CONCLUSION 

The molecular nature of the S-locus has been widely studied in
many species. In spite of the knowledge on the genetic
structure of the female (S-RNase) and male (SFB) determinants of
SI, the nature of their interaction remains unclear. The
presence of a loop in the predicted structure of Sf is a remarkable
difference between this S-RNase allele and the other two
analyzed in this work. We therefore suggest that this extended loop
may be involved in the manifestation of self-(in)compatibility
in almond. However, further studies using complementary approaches
are needed to better understand the role of this loop in the
SI complex.